THE PROBLEM
The Project
The National Aeronautics and Space Administration (NASA) High Density Vertiplex (HDV) subproject conducts flight tests within an air traffic management system that the team has created. This aspect of Urban Air Mobility (UAM) anticipates high-density urban operations in which many aircraft operate simultaneously, pushing air traffic density far beyond that of current urban airports and making automation an essential part of the system. The team therefore conducts simulation and live flight tests of automated technologies to serve the broader UAM effort, which incorporates both automation and human roles. The team includes members from NASA-Ames in Mountain View, CA, and NASA-Langley in Hampton, VA. Its responsibility is to create an air traffic management system and test autonomous vehicles in that system, which it does through periodic flight tests that generate a great deal of data for researchers to analyze before moving to the next phase of development. This project examined the team's workflow and then conducted a usability test on a communication tool intended to improve team situation awareness.
My Role
For this project, as a human factors intern for the NASA-Ames Autonomous Vehicle Application Lab (AVAL), I was assigned to create a data visualization tool for the HDV team and worked closely with data manager Kathryn Chapman and others. I conducted semi-structured interviews with key members of the target audience to assess workflow, created affinity maps, personas, and a technical report based on the qualitative research, and followed those with an Adobe XD prototype, usability testing, and a report on the quantitative data.
The Challenge
A high level of team situation awareness is required so that the many people playing the various roles know which steps have been completed and can start the next one in time to keep the simulation or live test running seamlessly. Qualitative user interviews revealed that 60 percent of participants saw a gap in communications. That was a key problem to solve, so the team created an Adobe XD prototype of a proposed tool to be used in the next live flight, scheduled for March 2023. This report includes the workflow analysis and usability testing results for that prototype. Usability testing is a method for understanding the usability of a digital product or website by asking participants to perform critical tasks with the interface. It is an easy, inexpensive, and repeatable process that reliably reveals both the problems the target audience has with an interface and the areas that work well for the user. This activity aligns with the fifth objective in the project plan, which says “create a plan to research, test and develop a tool by November 2022.”
The Tools
I used the following software to complete this project: Google Suite, Adobe XD, Microsoft Teams, Microsoft Word, Microsoft PowerPoint, Sublime Text, and Cyberduck.
DISCOVERY
Design, discovery and validation
One of the key parts of discovery is learning about the target audience for the product, which informs both design and validation. The target audience for this tool is researchers and tech leads on the HDV team, which is made up of three main groups: Airspace Systems Integration (ASI), Vehicle Vertiport Systems Integration (VVSI), and Flight Operations (FO). User profiles were developed from semi-structured interviews with members of these three groups, following the Oracle user profile template.
UX RESEARCH
Interviews
The first step is understanding the users' frustrations, motivations, and goals while assessing workflow to determine existing problems. I created an interview script and interviewed 8 people from my target audience, including both researchers and tech leads working in California and Virginia. Immediately after each interview I summarized it, and I then wrote a qualitative technical report to synthesize my findings.
The high-level findings fell into two areas:
Data visualization & processing:
- • An FO researcher could not explain why an autonomous vehicle was failing to increase elevation as it headed for a line of trees when a safety pilot on the ground brought the concern to him. Only after reviewing logs and screen recordings, after the safety pilot brought the vehicle down, did he realize the automation had switched the mode from mission to onboarding, which caused the elevation to level off. Seeing the data from the logs and screen recordings in visual form would have helped this researcher understand the reason for the elevation change and communicate it to the safety pilot on the ground.
- • A human factors researcher had difficulty processing qualitative interview and survey data after receiving it. New tools and methods are needed, as this process took him months for the AOA Flight Test.
- • A research lead said the most difficult part of the process was data processing: organizing, reconciling, and processing qualitative data.
Communication:
- • A VVSI tech lead said a lack of transparent communication about whether a preceding step had been completed made his team late for their own step. The lag caused the system to log out, and restarting the system delayed their step further.
- • The Flight Test Director (FTD) used both Teams chat and audio to inform all the different roles of when they should complete their steps. Some researchers found the noise and messaging distracting and a hindrance to accomplishing their own work.
- • Another NASA-Ames researcher had no clear indication of when she should accomplish her step. Her only cues were listening to the FTD, watching Teams, and the knowledge that she should shut down the vertiport somewhere around Snoopy’s head on the display map.
Persona & empathy map
Once I had completed my research, the next step was to understand the data, so I created a set of affinity maps and, from those, a set of personas based on the interviews. My analysis showed two clear customer types, which I named Sally the researcher and Sam the tech lead. It also showed that the clear problem I needed to solve was the communication gap between team members; the data visualization tool took a back seat to this more critical priority.
DESIGN
Prototype
This team is made up of many smaller groups across the two NASA centers, including members of the Autonomous Vehicles Application Lab (AVAL), Airspace Operations Lab (AOL), and Remote Operations for Autonomous Missions (ROAM); the first two are in California and the third is in Virginia. ROAM monitors live and simulated flight tests, which are conducted at Langley Research Center. The flight test director created a checklist to keep the many people whose tasks impact other members’ roles on schedule and to build team situation awareness. I used this checklist as the foundation for my prototype, which I tested with members of the larger HDV team, including members from ROAM, AOL, and AVAL. Try out the prototype here.
TESTING
Methods
Our team, consisting of one human factors graduate student and the NASA HDV data manager, compiled the information contained in this report. The human factors graduate student created a usability test plan with the following six steps:
- 1. Determine User Goals: Determined via User Needs Analysis of n = 6 HDV employees
- 2. Determine Customer Requirements: Determined from a series of qualitative user interviews
- 3. Determine Goals of Evaluation: Determine how well the existing iteration of the prototype provides readily available, easily accessible, and important information to researchers (task performance). Identify usability issues that hinder access to relevant information or cause frustration for those using the tool (user satisfaction).
- 4. Determine Evaluation Methods: Usability evaluation
- 5. Detail Methods Being Used: Moderated formative usability testing (appropriate for iterative design, limited participants, and a quick turnaround time)
- 6. Identify test metrics
- • Time on task
- • Task success rate
- • Self-reported task difficulty
- • System Usability Scale (SUS) for websites
User needs analysis
The first step in the process was to perform the user needs analysis, which determines who your users are, their characteristics, what demands the system will put on them, and what performance criteria you will use to evaluate it. This is an outline of the users for the HDV communication tool:
- 1. Identify the different classes of users (flight test director, researchers, tech leads, developers).
- • Researchers for the Airspace Systems Integration (ASI) and Flight Operations (FO) teams
- • Tech leads for the ASI, VVSI, and FO teams
- • Flight test director
- • Developers for the ASI, VVSI, and FO teams
- 2. Characteristics of users (that will affect how they use/interact with product)
- • Gender: Male or female
- • Age: Between 18 and 35 years old
- • Expertise: Worked on High Density Vertiplex (HDV) subproject
- • Frequency of usage: None
- 3. Operational procedures
- • Determined by the users
- 4. Performance criteria (speed, accuracy, quality, consequences of not meeting criteria)
- • Completion time
- • Accuracy
- 5. Task demands (physical, perceptual, cognitive, health and safety)
- • Physical, perceptual, cognitive, health and safety
- 6. Environment where product will be used
- • Over Teams call with training provided beforehand
- 7. Availability of technical assistance
- • None except what can be found on the prototype
Customer requirements
The second step in the process is to understand the context in which the tool will be used. The requirements were set by the customer and written jointly by Kathryn and Shraddha.
- Use case: Researchers and tech leads involved in an HDV flight test or simulation said that a lack of transparent communication about whether a preceding step had been completed could delay them in the step they are responsible for. This lag caused the system to log out, and restarting the system delayed their step in the process.
- • Create a communication tool that allows researchers to see the status of the step being conducted in the live flight test or simulation.
- • This tool shall cue researchers to engage specific tasks as appropriate for their role.
- • This tool shall also tell the researcher how preceding steps affect their own tasks.
- • This tool shall not distract or interfere with research activities.
Goals of evaluation
Determine how well the existing iteration of the prototype provides readily available, easily accessible, and important information to researchers (task performance). Identify usability issues that hinder access to relevant information or cause frustration for those using the tool (user satisfaction).
Method being used
Efficiency, effectiveness and usability of this product will be evaluated using a moderated formative usability test which is appropriate for iterative design, limited participants and quick turnaround time.
Metrics being used
- • Time on task
- • Task success rate
- • Self-reported task difficulty
- • System Usability Scale (SUS) for websites
RESULTS
Participant demographics are presented first, followed by summary data, a detailed analysis of the three least problematic tasks, and a detailed analysis of the single most problematic task.
Participant Demographics
- • Gender: Female (2), Male (4)
- • Ages: 20-30 (4); 31-40 (2)
- • Roles: Tech lead (1); Researchers (4); Developer (1)
- • Experience: 0 to 5 years (4); 6 to 10 years (1); 11 to 15 years (1); 16+ years
- • Base: NASA-Ames (4); NASA-Langley (2)
- • HDV Flight or Simulation experience: 0 to 5 years (4); 6 to 10 years (0); 11+ years (1); Other (0)
- • Team situation awareness (1 = weak, 5 = strong): 1 (0); 2 (0); 3 (3); 4 (2); 5 (1)
Quantitative results
An overview of the data for each metric used to determine the usability of the prototype is as follows:
System Usability Scale
The SUS is a self-report measure that asks participants 10 questions relating to their perceptions of the communication tool prototype. The odd-numbered questions are positively-worded and the even-numbered questions are negatively-worded. The SUS is scored as follows:
- • Odd-numbered questions: subtract 1 from the participant’s rating;
- • Even-numbered questions: subtract the participant’s rating from 5;
- • Add up the new scores, then multiply the total by 2.5;
- • Scores range from 0 – 100;
- • Scores < 50 = unacceptable usability;
- • Scores 50 – 70 = marginal usability;
- • Scores > 70 = acceptable usability.
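As a quick illustration of these scoring rules, the short Python sketch below scores a single set of ratings. The function name and the example ratings are hypothetical, not drawn from our participants' responses.

```python
def sus_score(ratings):
    """Score one participant's 10 SUS ratings (1-5 scale) as described above."""
    if len(ratings) != 10:
        raise ValueError("SUS requires exactly 10 item ratings")
    total = 0
    for item_number, rating in enumerate(ratings, start=1):
        if item_number % 2 == 1:      # odd-numbered, positively worded items
            total += rating - 1
        else:                         # even-numbered, negatively worded items
            total += 5 - rating
    return total * 2.5                # scale the 0-40 raw total to 0-100

# Hypothetical, mostly positive response pattern
print(sus_score([5, 2, 4, 1, 5, 2, 4, 2, 5, 1]))  # 87.5
```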
The SUS score for each of our participants appears below. The green bar represents the minimum score required to obtain acceptable usability. Here, the average SUS score was almost 84, which is well above acceptable usability.
Completion time, accuracy rate and self-reported task difficulty
Data are summarized by task in the table below.
Task | Average time on task (seconds) | Binary success rate | 95% confidence interval | Laplace point estimate | Average reported task difficulty |
---|---|---|---|---|---|
1 | 22.20 | 6/6 | .64 - 1.00 | .88 | 4.83 |
2 | 58.28 | 6/6 | .64 - 1.00 | .88 | 4.33 |
3 | 84.68 | 4/6 | .23 - .91 | .63 | 4.25 |
4 | 19.61 | 6/6 | .64 - 1.00 | .88 | 4.00 |
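The Laplace point estimate in the table is the standard add-one estimate, (successes + 1) / (trials + 2). The table does not record which method produced the 95% confidence intervals; the sketch below pairs the Laplace estimate with an Adjusted Wald interval, a common choice for small-sample success rates, so its bounds may differ slightly from the values shown.

```python
import math

def laplace_estimate(successes, trials):
    """Laplace (add-one) point estimate of a task success rate."""
    return (successes + 1) / (trials + 2)

def adjusted_wald_interval(successes, trials, z=1.96):
    """Adjusted Wald 95% interval for a small-sample binomial proportion."""
    n_adj = trials + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

# Task 3: 4 of 6 participants succeeded
print(laplace_estimate(4, 6))        # 0.625, reported as .63 in the table
print(adjusted_wald_interval(4, 6))  # roughly (0.30, 0.91)
```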
Time on Task Summary Data
To find the central point of the time on task data, we used the geometric mean, as it is the most appropriate when analyzing centrality of highly skewed and variable data such as time-related data. The optimal task time, range, and standard deviation are also provided in the chart below.
Participant | Task 1 | Task 2 | Task 3 | Task 4 |
---|---|---|---|---|
1 | 30 | 105 | 130 | 35 |
2 | 29 | 45 | 27 | 14 |
3 | 15 | 29 | 420 | 7 |
4 | 18 | 27 | 41 | 7 |
5 | 14 | 107 | 107 | 37 |
6 | 82 | 99 | 57 | 64 |
Geomean | 22.20 | 58.28 | 84.68 | 19.61 |
Optimal Time | 3.97 | 5.90 | 4.48 | 5.50 |
Range | 14-82 | 27-107 | 27-420 | 7-64 |
Standard Deviation | 25.77 | 38.93 | 147.30 | 22.37 |
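As a reference for how the Geomean row was computed, the sketch below applies the geometric mean (the exponential of the mean log time) to the Task 3 column of the table above; the function name is illustrative.

```python
import math

def geometric_mean(times):
    """Geometric mean of task times: exp of the mean of the log-transformed times."""
    return math.exp(sum(math.log(t) for t in times) / len(times))

# Per-participant Task 3 times (seconds) from the table above
task3_times = [130, 27, 420, 41, 107, 57]
print(round(geometric_mean(task3_times), 2))  # ~84.68, matching the Geomean row
```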
Task key
- Task #1: Log in as a GCSO researcher and obtain security clearance
- Task #2: Find the upcoming task for the Range Safety Officer
- Task #3: Find the upcoming task for yourself, the GCSO researcher
- Task #4: Check off the next test for yourself, the GCSO researcher, to let the team know that step is complete.
The two tasks with the highest geomean completion times were as follows:
- • Task 2 (Find the upcoming task for the Range Safety Officer): 58.28 seconds;
- • Task 3 (Find the upcoming task for yourself, the GCSO researcher): 84.68 seconds.
The above summary results allowed our team to easily identify the single most problematic task and the three least problematic tasks, each of which is discussed in detail below. Because our test tasks mainly involved finding information in the prototype, we identified time on task and success rate as the most important measures for our detailed evaluation. The three least problematic tasks are listed below. Because these tasks had relatively high success rates, relatively low task times, and low difficulty ratings, the remainder of this report focuses on the one task we deemed most problematic.
Three Least Problematic Tasks
The table below summarizes the success rate, geometric mean time on task, optimal time, and mean difficulty rating for the three least problematic tasks.
Task | Success rate | Geomean time on task (seconds) | Optimal time* (seconds) | Mean difficulty rate (out of 5) |
---|---|---|---|---|
#1: Log in as a GCSO researcher and obtain clearance | 6/6 | 22.20 | 3.97 | 4.83 |
#2: Find the upcoming task for Range Safety Officer | 6/6 | 58.28 | 5.90 | 4.33 |
#4: Check off the next test for yourself, the GCSO researcher, to let the team know that step is complete. | 6/6 | 19.61 | 5.50 | 4.00 |
*Note: Optimal time was determined by team members who were familiar with the fastest way to accomplish the task. Participants were unfamiliar with where to find the information and were also thinking out loud as they were completing the tasks. As such, the discrepancies between the optimal time and participants’ actual times were expected.
Single Most Problematic Task
As stated above, our team focused on time on task and success rate as the most important metrics for our detailed evaluation of the single most problematic task, which was Task #3.
Task | Success rate | Geomean time on task (seconds) | Optimal time* (seconds) | Mean difficulty rate (out of 5) |
---|---|---|---|---|
#3: Find upcoming task for yourself, the GCSO researcher. | 4/6 | 84.68 | 4.48 | 4.25 |
Task # 3: Find upcoming task for yourself, the GCSO researcher
- Success rate: 4/6. (One participant could not find this information immediately and eventually made it to the My Tasks tab to complete the activity but exceeded the 2-minute time limit; another participant found the information but also exceeded the time limit; the other four found the information on the Home Page rather than the My Tasks page.)
- Geomean task time: 84.68 seconds
- Optimal time: 4.48 seconds
- Difficulty score: 4.25/5.0
- • “So that's also something that I guess wasn't maybe obvious to me. ”
- • “I would think that the interface would make it more clear which ones are mine versus theirs, right? Cause that could be confusing if there's more than one.”
- • “OK, so I would want to check this (on the Home Page). I would try to click here. But that doesn't work, so I need to go over here to My Tasks.”
Half of the subjects saw the upcoming tasks for the GCSO researcher on the Home Page and either didn’t see the My Tasks button or were waiting for it to light up before going there. One of them eventually got there; another did not, though that participant did go there for Task 4 when trying to check off the task as completed. Information about the next task can be found on the Home Page, but it is static; the only way to interact with it is to go to the My Tasks page by clicking the My Tasks link in the navigation.
Four participants were able to complete this task, but the majority expressed frustration at having to go to a whole new page to see tasks for different roles, as reflected in the participant quotes above.